Results 1 - 20 of 32
1.
Article in English | MEDLINE | ID: mdl-38598391

ABSTRACT

In this article, we propose a method, generative image reconstruction from gradients (GIRG), for recovering training images from gradients in a federated learning (FL) setting, where privacy is preserved by sharing model weights and gradients rather than raw training data. Previous studies have shown the potential for revealing clients' private information or even pixel-level recovery of training images from shared gradients. However, existing methods are limited to low-resolution images and small batch sizes (BSs) or require prior knowledge about the client data. GIRG utilizes a conditional generative model to reconstruct training images and their corresponding labels from the shared gradients. Unlike previous generative model-based methods, GIRG does not require prior knowledge of the training data. Furthermore, GIRG optimizes the weights of the conditional generative model to generate highly accurate "dummy" images instead of optimizing the input vectors of the generative model. Comprehensive empirical results show that GIRG is able to recover high-resolution images with large BSs and can even recover images from the aggregation of gradients from multiple participants. These results reveal the vulnerability of current FL practices and call for immediate efforts to prevent inversion attacks in gradient-sharing-based collaborative training.
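
To illustrate why shared gradients can leak training data at all, the sketch below uses the simplest analytic case: a single linear layer with a bias and a squared-error loss. This is not GIRG, which trains a conditional generative model; it is only a toy example, and the function names and values are hypothetical. Because the weight gradient of row i equals the bias gradient of row i times the input, dividing one by the other returns the input exactly.

```python
def linear_layer_gradients(x, weights, bias, target):
    """Gradients of 0.5 * sum((W x + b - target)^2) w.r.t. W and b."""
    outputs = [sum(w * xj for w, xj in zip(row, x)) + b
               for row, b in zip(weights, bias)]
    delta = [y - t for y, t in zip(outputs, target)]   # dLoss/dOutput
    grad_w = [[d * xj for xj in x] for d in delta]     # dLoss/dW[i][j] = delta[i] * x[j]
    grad_b = list(delta)                               # dLoss/db[i]    = delta[i]
    return grad_w, grad_b


def recover_input(grad_w, grad_b, eps=1e-12):
    """Recover x from shared gradients using any row with a non-zero bias gradient."""
    for row, gb in zip(grad_w, grad_b):
        if abs(gb) > eps:
            return [g / gb for g in row]
    raise ValueError("all bias gradients are zero; input cannot be recovered")


# Hypothetical private input and model parameters.
private_x = [0.2, 0.5, 0.9]
weights = [[0.1, -0.3, 0.2], [0.4, 0.0, -0.1]]
bias = [0.05, -0.02]
target = [1.0, 0.0]

grad_w, grad_b = linear_layer_gradients(private_x, weights, bias, target)
recovered_x = recover_input(grad_w, grad_b)
```

Deep networks and batched gradients do not admit such a closed form, which is why attacks of the kind described in the abstract rely on optimization instead.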

3.
Cell Rep Med ; 5(2): 101419, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38340728

ABSTRACT

Federated learning (FL) is a distributed machine learning framework that is gaining traction in view of increasing health data privacy protection needs. By conducting a systematic review of FL applications in healthcare, we identify relevant articles in scientific, engineering, and medical journals in English up to August 31st, 2023. Out of a total of 22,693 articles under review, 612 articles are included in the final analysis. The majority of articles are proof-of-concept studies, and only 5.2% are studies with real-life application of FL. Radiology and internal medicine are the most common specialties involved in FL. FL is robust to a variety of machine learning models and data types, with neural networks and medical imaging being the most common, respectively. We highlight the need to address the barriers to clinical translation and to assess its real-world impact in this new digital data-driven healthcare scene.


Subjects
Machine Learning , Medicine , Humans , Neural Networks, Computer
4.
IEEE Trans Med Imaging ; 43(5): 1945-1957, 2024 May.
Article in English | MEDLINE | ID: mdl-38206778

ABSTRACT

Color fundus photography (CFP) and optical coherence tomography (OCT) images are two of the most widely used modalities in the clinical diagnosis and management of retinal diseases. Despite the widespread use of multimodal imaging in clinical practice, few methods for automated diagnosis of eye diseases utilize correlated and complementary information from multiple modalities effectively. This paper explores how to leverage the information from CFP and OCT images to improve the automated diagnosis of retinal diseases. We propose a novel multimodal learning method, named geometric correspondence-based multimodal learning network (GeCoM-Net), to achieve the fusion of CFP and OCT images. Specifically, inspired by clinical observations, we consider the geometric correspondence between the OCT slice and the CFP region to learn the correlated features of the two modalities for robust fusion. Furthermore, we design a new feature selection strategy to extract discriminative OCT representations by automatically selecting the important feature maps from OCT slices. Unlike the existing multimodal learning methods, GeCoM-Net is the first method that formulates the geometric relationships between the OCT slice and the corresponding region of the CFP image explicitly for CFP and OCT fusion. Experiments have been conducted on a large-scale private dataset and a publicly available dataset to evaluate the effectiveness of GeCoM-Net for diagnosing diabetic macular edema (DME), impaired visual acuity (VA) and glaucoma. The empirical results show that our method outperforms the current state-of-the-art multimodal learning methods by improving the AUROC score by 0.4%, 1.9% and 2.9% for DME, VA and glaucoma detection, respectively.


Subjects
Image Interpretation, Computer-Assisted , Multimodal Imaging , Tomography, Optical Coherence , Humans , Tomography, Optical Coherence/methods , Multimodal Imaging/methods , Image Interpretation, Computer-Assisted/methods , Algorithms , Retinal Diseases/diagnostic imaging , Retina/diagnostic imaging , Machine Learning , Photography/methods , Diagnostic Techniques, Ophthalmological , Databases, Factual
5.
Nat Commun ; 14(1): 6757, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37875484

ABSTRACT

Failure to recognize samples from the classes unseen during training is a major limitation of artificial intelligence in the real-world implementation for recognition and classification of retinal anomalies. We establish an uncertainty-inspired open set (UIOS) model, which is trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calculates an uncertainty score to express its confidence. Our UIOS model with thresholding strategy achieves an F1 score of 99.55%, 97.01% and 91.91% for the internal testing set, external target categories (TC)-JSIEC dataset and TC-unseen testing set, respectively, compared to the F1 score of 92.20%, 80.69% and 64.74% by the standard AI model. Furthermore, UIOS correctly predicts high uncertainty scores, which would prompt a manual check, for datasets of non-target-category retinal diseases, low-quality fundus images, and non-fundus images. UIOS provides a robust method for real-world screening of retinal anomalies.
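
The thresholding strategy mentioned in the abstract can be pictured with a minimal sketch: accept the most probable category when the uncertainty score is low, and refer the image for a manual check otherwise. The abstract does not say how UIOS computes its uncertainty score or which threshold it uses, so the score is taken here as a given input and the threshold of 0.5 is a hypothetical placeholder.

```python
def screen_image(probabilities, uncertainty, threshold=0.5):
    """Return the index of the predicted category, or None to request a manual check.

    `uncertainty` is assumed to be supplied by the model; `threshold` is a
    hypothetical value, not one reported in the abstract.
    """
    if uncertainty > threshold:
        return None  # too uncertain: defer to a human grader
    return max(range(len(probabilities)), key=probabilities.__getitem__)


confident = screen_image([0.05, 0.90, 0.05], uncertainty=0.10)   # category 1
deferred = screen_image([0.40, 0.35, 0.25], uncertainty=0.80)    # None
```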


Subjects
Eye Abnormalities , Retinal Diseases , Humans , Artificial Intelligence , Algorithms , Uncertainty , Retina/diagnostic imaging , Fundus Oculi , Retinal Diseases/diagnostic imaging
6.
Article in English | MEDLINE | ID: mdl-37368806

ABSTRACT

In-memory deep learning executes neural network models where they are stored, thus avoiding long-distance communication between memory and computation units, resulting in considerable savings in energy and time. In-memory deep learning has already demonstrated orders of magnitude higher performance density and energy efficiency. The use of emerging memory technology (EMT) promises to increase density, energy, and performance even further. However, EMT is intrinsically unstable, resulting in random data read fluctuations. This can translate to nonnegligible accuracy loss, potentially nullifying the gains. In this article, we propose three optimization techniques that can mathematically overcome the instability problem of EMT. They can improve the accuracy of the in-memory deep learning model while maximizing its energy efficiency. Experiments show that our solution can fully recover most models' state-of-the-art (SOTA) accuracy and achieves at least an order of magnitude higher energy efficiency than the SOTA.

7.
Diagnostics (Basel) ; 13(8)2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37189498

ABSTRACT

Chest X-rays (CXRs) are essential in the preliminary radiographic assessment of patients affected by COVID-19. Junior residents, as the first point-of-contact in the diagnostic process, are expected to interpret these CXRs accurately. We aimed to assess the effectiveness of a deep neural network in distinguishing COVID-19 from other types of pneumonia, and to determine its potential contribution to improving the diagnostic precision of less experienced residents. A total of 5051 CXRs were utilized to develop and assess an artificial intelligence (AI) model capable of performing three-class classification, namely non-pneumonia, non-COVID-19 pneumonia, and COVID-19 pneumonia. Additionally, an external dataset comprising 500 distinct CXRs was examined by three junior residents with differing levels of training. The CXRs were evaluated both with and without AI assistance. The AI model demonstrated impressive performance, with an Area under the ROC Curve (AUC) of 0.9518 on the internal test set and 0.8594 on the external test set, which improves the AUC score of the current state-of-the-art algorithms by 1.25% and 4.26%, respectively. When assisted by the AI model, the performance of the junior residents improved in a manner that was inversely proportional to their level of training. Among the three junior residents, two showed significant improvement with the assistance of AI. This research highlights the novel development of an AI model for three-class CXR classification and its potential to augment junior residents' diagnostic accuracy, with validation on external data to demonstrate real-world applicability. In practical use, the AI model effectively supported junior residents in interpreting CXRs, boosting their confidence in diagnosis. While the AI model improved junior residents' performance, a decline in performance was observed on the external test set compared to the internal test set. This suggests a domain shift between the patient dataset and the external dataset, highlighting the need for future research on test-time training domain adaptation to address this issue.

8.
iScience ; 26(4): 106546, 2023 Apr 21.
Article in English | MEDLINE | ID: mdl-37123247

ABSTRACT

Genomic researchers increasingly utilize commercial cloud service providers (CSPs) to manage data and analytics needs. CSPs allow researchers to grow Information Technology (IT) infrastructure on demand to overcome bottlenecks when combining large datasets. However, without adequate security controls, the risk of unauthorized access may be higher for data stored on the cloud. Additionally, regulators are mandating data access patterns and specific security protocols for the storage and use of genomic data. While CSPs provide tools for security and regulatory compliance, building the necessary controls required for cloud solutions is not trivial. Research Assets Provisioning and Tracking Online Repository (RAPTOR) by the Genome Institute of Singapore is a cloud-native genomics data repository and analytics platform that implements a "five-safes" framework to provide security and governance controls to data contributors and users, leveraging CSPs for sharing and analysis of genomic datasets without the risk of security breaches or running afoul of regulations.

9.
Front Psychol ; 14: 1136448, 2023.
Article in English | MEDLINE | ID: mdl-37057174

ABSTRACT

Purpose: This study explores the association between the duration and variation of infant sleep trajectories and subsequent cognitive school readiness at 48-50 months. Methods: Participants were 288 multi-ethnic children, within the Growing Up in Singapore Towards healthy Outcomes (GUSTO) cohort. Caregiver-reported total, night and day sleep durations were obtained at 3, 6, 9, 12, 18 and 24 months using the Brief Infant Sleep Questionnaire and at 54 months using the Child Sleep Habits Questionnaire. Total, night and day sleep trajectories with varying durations (short, moderate, or long) and variability (consistent or variable; defined by standard errors) were identified. The cognitive school readiness test battery was administered when the children were between 48 and 50 months old. Both unadjusted analysis of variance models and adjusted analysis of covariance models (for confounders) were performed to assess associations between sleep trajectories and individual school readiness tests in the domains of language, numeracy, general cognition and memory. Results: In the unadjusted models, children with short variable total sleep trajectories had poorer performance on language tests compared to those with longer and more consistent trajectories. In both unadjusted and adjusted models, children with short variable night sleep trajectories had poorer numeracy knowledge compared to their counterparts with long consistent night sleep trajectories. There were no equivalent associations between sleep trajectories and school readiness performance for tests in the general cognition or memory domains. There were no significant findings for day sleep trajectories. Conclusion: Findings suggest that individual differences in longitudinal sleep duration patterns from as early as 3 months of age may be associated with language and numeracy aspects of school readiness at 48-50 months of age. This is important, as early school readiness, particularly the domains of language and mathematics, is a key predictor of subsequent academic achievement.

10.
Front Public Health ; 11: 1063466, 2023.
Article in English | MEDLINE | ID: mdl-36860378

ABSTRACT

Purpose: The COVID-19 pandemic has drastically disrupted global healthcare systems. With the higher demand for healthcare and misinformation related to COVID-19, there is a need to explore alternative models to improve communication. Artificial Intelligence (AI) and Natural Language Processing (NLP) have emerged as promising solutions to improve healthcare delivery. Chatbots could fill a pivotal role in the dissemination and easy accessibility of accurate information in a pandemic. In this study, we developed a multi-lingual NLP-based AI chatbot, DR-COVID, which responds accurately to open-ended, COVID-19 related questions. This was used to facilitate pandemic education and healthcare delivery. Methods: First, we developed DR-COVID with an ensemble NLP model on the Telegram platform (https://t.me/drcovid_nlp_chatbot). Second, we evaluated various performance metrics. Third, we evaluated multi-lingual text-to-text translation to Chinese, Malay, Tamil, Filipino, Thai, Japanese, French, Spanish, and Portuguese. We utilized 2,728 training questions and 821 test questions in English. Primary outcome measurements were (A) overall and top 3 accuracies; (B) Area Under the Curve (AUC), precision, recall, and F1 score. Overall accuracy referred to a correct response for the top answer, whereas top 3 accuracy referred to an appropriate response for any one answer amongst the top 3 answers. AUC and its related metrics were obtained from the Receiver Operating Characteristic (ROC) curve. Secondary outcomes were (A) multi-lingual accuracy; (B) comparison to enterprise-grade chatbot systems. The sharing of training and testing datasets on an open-source platform will also contribute to existing data. Results: Our NLP model, utilizing the ensemble architecture, achieved overall and top 3 accuracies of 0.838 [95% confidence interval (CI): 0.826-0.851] and 0.922 [95% CI: 0.913-0.932] respectively. For overall and top 3 results, AUC scores of 0.917 [95% CI: 0.911-0.925] and 0.960 [95% CI: 0.955-0.964] were achieved respectively. We achieved multilingualism with nine non-English languages, with Portuguese performing the best overall at 0.900. Lastly, DR-COVID generated answers more accurately and quickly than other chatbots, within 1.12-2.15 s across three devices tested. Conclusion: DR-COVID is a clinically effective NLP-based conversational AI chatbot, and a promising solution for healthcare delivery in the pandemic era.
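
The two accuracy definitions in the Methods can be made concrete with a small sketch. It assumes each test question comes with a ranked list of candidate answers and one reference answer; the function name and the sample data are hypothetical and are not taken from the DR-COVID datasets.

```python
def top_k_accuracy(ranked_answers, reference_answers, k):
    """Fraction of questions whose reference answer appears among the top k candidates."""
    hits = sum(
        1
        for ranked, reference in zip(ranked_answers, reference_answers)
        if reference in ranked[:k]
    )
    return hits / len(reference_answers)


# Hypothetical ranked outputs for four questions whose reference answer is "a".
ranked = [
    ["a", "b", "c"],  # correct at rank 1
    ["b", "a", "c"],  # correct at rank 2
    ["c", "d", "a"],  # correct at rank 3
    ["d", "e", "f"],  # not in the top 3
]
reference = ["a", "a", "a", "a"]

overall_accuracy = top_k_accuracy(ranked, reference, k=1)  # 0.25
top3_accuracy = top_k_accuracy(ranked, reference, k=3)     # 0.75
```

Overall accuracy is the k = 1 case, so it can never exceed top 3 accuracy on the same test set.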


Subjects
COVID-19 , Deep Learning , Humans , Natural Language Processing , Artificial Intelligence , Pandemics , India
11.
Nat Genet ; 55(2): 178-186, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36658435

ABSTRACT

Precision medicine promises to transform healthcare for groups and individuals through early disease detection, refining diagnoses and tailoring treatments. Analysis of large-scale genomic-phenotypic databases is a critical enabler of precision medicine. Although Asia is home to 60% of the world's population, many Asian ancestries are under-represented in existing databases, leading to missed opportunities for new discoveries, particularly for diseases most relevant for these populations. The Singapore National Precision Medicine initiative is a whole-of-government 10-year initiative aiming to generate precision medicine data of up to one million individuals, integrating genomic, lifestyle, health, social and environmental data. Beyond technologies, routine adoption of precision medicine in clinical practice requires social, ethical, legal and regulatory barriers to be addressed. Identifying driver use cases in which precision medicine results in standardized changes to clinical workflows or improvements in population health, coupled with health economic analysis to demonstrate value-based healthcare, is a vital prerequisite for responsible health system adoption.


Subjects
Delivery of Health Care , Precision Medicine , Humans , Singapore , Precision Medicine/methods , Asia
12.
JMIR Form Res ; 7: e38555, 2023 Feb 07.
Article in English | MEDLINE | ID: mdl-36649223

ABSTRACT

BACKGROUND: The 2019 novel COVID-19 has severely burdened the health care system through its rapid transmission. Mobile health (mHealth) is a viable solution to facilitate remote monitoring and continuity of care for patients with COVID-19 in a home environment. However, the conceptualization and development of mHealth apps are often time and labor-intensive and are laden with concerns relating to data security and privacy. Implementing mHealth apps is also a challenging feat as language-related barriers limit adoption, whereas its perceived lack of benefits affects sustained use. The rapid development of an mHealth app that is cost-effective, secure, and user-friendly will be a timely enabler. OBJECTIVE: This project aimed to develop an mHealth app, DrCovid+, to facilitate remote monitoring and continuity of care for patients with COVID-19 by using the rapid development approach. It also aimed to address the challenges of mHealth app adoption and sustained use. METHODS: The Rapid Application Development approach was adopted. Stakeholders including decision makers, physicians, nurses, health care administrators, and research engineers were engaged. The process began with requirements gathering to define and finalize the project scope, followed by an iterative process of developing a working prototype, conducting User Acceptance Tests, and improving the prototype before implementation. Co-designing principles were applied to ensure equal collaborative efforts and collective agreement among stakeholders. RESULTS: DrCovid+ was developed on Telegram Messenger and hosted on a cloud server. It features a secure patient enrollment and data interface, a multilingual communication channel, and both automatic and personalized push messaging. A back-end dashboard was also developed to collect patients' vital signs for remote monitoring and continuity of care. To date, 400 patients have been enrolled into the system, amounting to 2822 hospital bed-days saved. 
CONCLUSIONS: The rapid development and implementation of DrCovid+ allowed for timely clinical care management for patients with COVID-19. It facilitated early patient hospital discharge and continuity of care while addressing issues relating to data security and labor-, time-, and cost-effectiveness. The use case for DrCovid+ may be extended to other medical conditions to advance patient care and empowerment within the community, thereby meeting existing and rising population health challenges.

13.
Sleep ; 46(2)2023 Feb 08.
Article in English | MEDLINE | ID: mdl-36355436

ABSTRACT

STUDY OBJECTIVES: Examine how different trajectories of reported sleep duration associate with early childhood cognition. METHODS: Caregiver-reported sleep duration data (n = 330) were collected using the Brief Infant Sleep Questionnaire at 3, 6, 9, 12, 18, and 24 months and Children's Sleep Habits Questionnaire at 54 months. Multiple group-based day-, night-, and/or total sleep trajectories were derived, each differing in duration and variability. Bayley Scales of Infant and Toddler Development-III (Bayley-III) and the Kaufman Brief Intelligence Test-2 (KBIT-2) were used to assess cognition at 24 and 54 months, respectively. RESULTS: Compared to short variable night sleep trajectory, long consistent night sleep trajectory was associated with higher scores on Bayley-III (cognition and language), while moderate/long consistent night sleep trajectories were associated with higher KBIT-2 (verbal and composite) scores. Children with a long consistent total sleep trajectory had higher Bayley-III (cognition and expressive language) and KBIT-2 (verbal and composite) scores compared to children with a short variable total sleep trajectory. Moderate consistent total sleep trajectory was associated with higher Bayley-III language and KBIT-2 verbal scores relative to the short variable total trajectory. Children with a long variable day sleep trajectory had lower Bayley-III (cognition and fine motor) and KBIT-2 (verbal and composite) scores compared to children with a short consistent day sleep trajectory. CONCLUSIONS: Longer and more consistent night- and total sleep trajectories, and a short day sleep trajectory in early childhood were associated with better cognition at 2 and 4.5 years.


Subjects
Child Development , Sleep Duration , Infant , Humans , Child, Preschool , Cognition
14.
Med Image Anal ; 83: 102664, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36332357

ABSTRACT

Pneumonia can be difficult to diagnose since its symptoms are highly variable, and the radiographic signs are often very similar to those seen in other illnesses such as a cold or influenza. Deep neural networks have shown promising performance in automated pneumonia diagnosis using chest X-ray radiography, allowing mass screening and early intervention to reduce the severe cases and death toll. However, they usually require many well-labelled chest X-ray images for training to achieve high diagnostic accuracy. To reduce the need for training data and annotation resources, we propose a novel method called Contrastive Domain Adaptation with Consistency Match (CDACM). It transfers the knowledge from different but relevant datasets to the unlabelled small-size target dataset and improves the semantic quality of the learnt representations. Specifically, we design a conditional domain adversarial network to exploit discriminative information conveyed in the predictions to mitigate the domain gap between the source and target datasets. Furthermore, due to the small scale of the target dataset, we construct a feature cloud for each target sample and leverage contrastive learning to extract more discriminative features. Lastly, we propose adaptive feature cloud expansion to push the decision boundary to a low-density area. Unlike most existing transfer learning methods that aim only to mitigate the domain gap, our method simultaneously considers the domain gap and the data deficiency problem of the target dataset. The conditional domain adaptation and the feature cloud generation of our method are learned jointly to extract discriminative features in an end-to-end manner. Besides, the adaptive feature cloud expansion improves the model's generalisation ability in the target domain. Extensive experiments on pneumonia and COVID-19 diagnosis tasks demonstrate that our method outperforms several state-of-the-art unsupervised domain adaptation approaches, which verifies the effectiveness of CDACM for automated pneumonia diagnosis using chest X-ray imaging.


Subjects
COVID-19 Testing , COVID-19 , Humans
15.
Nat Mach Intell ; 5(7): 799-810, 2023 Jul.
Article in English | MEDLINE | ID: mdl-38706981

ABSTRACT

Medical artificial intelligence (AI) has tremendous potential to advance healthcare by supporting and contributing to the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving both healthcare provider and patient experience. Unlocking this potential requires systematic, quantitative evaluation of the performance of medical AI models on large-scale, heterogeneous data capturing diverse patient populations. Here, to meet this need, we introduce MedPerf, an open platform for benchmarking AI models in the medical domain. MedPerf focuses on enabling federated evaluation of AI models, by securely distributing them to different facilities, such as healthcare organizations. This process of bringing the model to the data empowers each facility to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status and real-world deployment, our roadmap and, importantly, the use of MedPerf with multiple international institutions within cloud-based technology and on-premises scenarios. Finally, we welcome new contributions by researchers and organizations to further strengthen MedPerf as an open benchmarking platform.

16.
Article in English | MEDLINE | ID: mdl-35969543

ABSTRACT

Spiking neural networks (SNNs) have advantages in latency and energy efficiency over traditional artificial neural networks (ANNs) due to their event-driven computation mechanism and the replacement of energy-consuming weight multiplication with addition. However, SNNs usually require long spike trains, often more than 1000 time steps, to achieve high accuracy. This offsets the computation efficiency brought by SNNs because a longer spike train means a larger number of operations and larger latency. In this article, we propose a radix-encoded SNN, which has ultrashort spike trains. Specifically, it is able to use fewer than six time steps to achieve even higher accuracy than its traditional counterpart. We also develop a method to fit our radix encoding technique into the ANN-to-SNN conversion approach so that we can train radix-encoded SNNs more efficiently on mature platforms and hardware. Experiments show that our radix encoding can achieve a 25× improvement in latency and a 1.7% improvement in accuracy compared to the state-of-the-art method using the VGG-16 network on the CIFAR-10 dataset.
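
One way to see why a radix encoding shortens spike trains is to compare it with counting spikes. The sketch below is an assumption-laden illustration, not the authors' scheme: it assumes radix 2, most-significant spike first, and activations in the range [0, 1). With T time steps, each step carries a positional weight, so T spikes distinguish 2^T levels, whereas a spike count over T steps distinguishes only T + 1.

```python
def radix_encode(activation, steps):
    """Encode an activation in [0, 1) as `steps` binary spikes, most significant first."""
    max_level = (1 << steps) - 1
    level = min(int(activation * (1 << steps)), max_level)
    return [(level >> (steps - 1 - t)) & 1 for t in range(steps)]


def radix_decode(spikes):
    """Weight the spike at time step t by 2^(steps - 1 - t) and rescale to [0, 1)."""
    steps = len(spikes)
    level = sum(spike << (steps - 1 - t) for t, spike in enumerate(spikes))
    return level / (1 << steps)


spikes = radix_encode(0.625, steps=4)  # [1, 0, 1, 0]
value = radix_decode(spikes)           # 0.625
levels_radix = 1 << 4                  # 16 levels from 4 time steps
levels_count = 4 + 1                   # 5 levels from counting spikes over 4 steps
```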

17.
Article in English | MEDLINE | ID: mdl-35998171

ABSTRACT

Efficient neural network training is essential for in situ training of edge artificial intelligence (AI) and carbon footprint reduction in general. Training neural networks on the edge is challenging because there is a large gap between the limited resources on the edge and the resource requirements of current training methods. Existing training methods are based on the assumption that the underlying computing infrastructure has sufficient memory and energy supplies. These methods involve two copies of the model parameters, which is usually beyond the capacity of on-chip memory in processors. The data movement between off-chip and on-chip memory consumes large amounts of energy. We propose resource constrained training (RCT) to realize resource-efficient training for edge devices and servers. RCT only keeps a quantized model throughout the training so that the memory requirement for model parameters in training is reduced. It adjusts per-layer bitwidth dynamically to save energy when a model can learn effectively with lower precision. We carry out experiments with representative models and tasks in image classification, natural language processing, and crowd counting applications. Experiments show that on average, 8-15-bit weight update is sufficient for achieving SOTA performance in these applications. RCT saves 63.5%-80% memory for model parameters and saves more energy for communications. Through experiments, we observe that the common practice on the first/last layer in model compression does not apply to efficient training. Also, interestingly, the more challenging a dataset is, the lower bitwidth is required for efficient training.
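
The idea of keeping weights at a per-layer bitwidth can be sketched with plain uniform quantization. The abstract does not specify RCT's quantizer or its rule for adjusting bitwidths, so everything below is an assumption for illustration: a symmetric uniform quantizer, a clipping range of [-1, 1], and hypothetical layer names and bitwidths.

```python
def quantize(weight, bits, max_abs=1.0):
    """Symmetric uniform quantization: return (integer code, dequantized value)."""
    levels = (1 << (bits - 1)) - 1                   # largest positive integer code
    clipped = max(-max_abs, min(max_abs, weight))    # clip to the representable range
    code = round(clipped / max_abs * levels)
    return code, code * max_abs / levels


# Hypothetical per-layer bitwidths; a dynamic scheme would change these during training.
layer_bits = {"conv1": 8, "fc": 4}

code8, value8 = quantize(0.3, layer_bits["conv1"])   # code 38, value 38/127
code4, value4 = quantize(0.3, layer_bits["fc"])      # code 2,  value 2/7
error8 = abs(0.3 - value8)
error4 = abs(0.3 - value4)
```

Lower bitwidths store fewer bits per weight at the cost of a larger rounding error, which is the trade-off a per-layer bitwidth schedule has to balance.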

18.
Med Image Anal ; 81: 102535, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35872361

ABSTRACT

Accurate skin lesion diagnosis requires a great effort from experts to identify the characteristics from clinical and dermoscopic images. Deep multimodal learning-based methods can reduce intra- and inter-reader variability and improve diagnostic accuracy compared to the single modality-based methods. This study develops a novel method, named adversarial multimodal fusion with attention mechanism (AMFAM), to perform multimodal skin lesion classification. Specifically, we adopt a discriminator that uses adversarial learning to force the feature extractor to learn the correlated information explicitly. Moreover, we design an attention-based reconstruction strategy to encourage the feature extractor to concentrate on learning the features of the lesion area, thus enhancing the feature vector from each modality with more discriminative information. Unlike existing multimodal-based approaches, which only focus on learning complementary features from dermoscopic and clinical images, our method considers both correlated and complementary information of the two modalities for multimodal fusion. To verify the effectiveness of our method, we conduct comprehensive experiments on a publicly available multimodal and multi-task skin lesion classification dataset: 7-point criteria evaluation database. The experimental results demonstrate that our proposed method outperforms the current state-of-the-art methods and improves the average AUC score by more than 2% on the test set.


Subjects
Diagnostic Imaging , Skin Diseases , Skin , Databases, Factual , Humans , Machine Learning , Skin/pathology , Skin Diseases/classification , Skin Diseases/diagnosis
19.
Article in English | MEDLINE | ID: mdl-35560072

ABSTRACT

Edge devices demand low energy consumption, low cost, and a small form factor. To efficiently deploy convolutional neural network (CNN) models on edge devices, energy-aware model compression becomes extremely important. However, existing work has not studied this problem well because it does not consider the diversity of dataflow types in hardware architectures. In this article, we propose EDCompress (EDC), an energy-aware model compression method for various dataflows. It can effectively reduce the energy consumption of various edge devices with different dataflow types. Considering the very nature of model compression procedures, we recast the optimization process as a multistep problem and solve it by reinforcement learning algorithms. We also propose a multidimensional multistep (MDMS) optimization method, which shows higher compressing capability than the traditional multistep method. Experiments show that EDC could improve energy efficiency by 20x, 17x, and 26x in VGG-16, MobileNet, and LeNet-5 networks, respectively, with negligible loss of accuracy. EDC could also indicate the optimal dataflow type for specific neural networks in terms of energy consumption, which can guide the deployment of CNNs on hardware.

20.
IEEE Trans Neural Netw Learn Syst ; 33(2): 798-810, 2022 Feb.
Article in English | MEDLINE | ID: mdl-33090960

ABSTRACT

Cross-modal retrieval (CMR) enables flexible retrieval experience across different modalities (e.g., texts versus images), which maximally benefits us from the abundance of multimedia data. Existing deep CMR approaches commonly require a large amount of labeled data for training to achieve high performance. However, it is time-consuming and expensive to annotate the multimedia data manually. Thus, how to transfer valuable knowledge from existing annotated data to new data, especially from the known categories to new categories, becomes attractive for real-world applications. To achieve this end, we propose a deep multimodal transfer learning (DMTL) approach to transfer the knowledge from the previously labeled categories (source domain) to improve the retrieval performance on the unlabeled new categories (target domain). Specifically, we employ a joint learning paradigm to transfer knowledge by assigning a pseudolabel to each target sample. During training, the pseudolabel is iteratively updated and passed through our model in a self-supervised manner. At the same time, to reduce the domain discrepancy of different modalities, we construct multiple modality-specific neural networks to learn a shared semantic space for different modalities by enforcing the compactness of homoinstance samples and the scatters of heteroinstance samples. Our method is remarkably different from most of the existing transfer learning approaches. To be specific, previous works usually assume that the source domain and the target domain have the same label set. In contrast, our method considers a more challenging multimodal learning situation where the label sets of the two domains are different or even disjoint. Experimental studies on four widely used benchmarks validate the effectiveness of the proposed method in multimodal transfer learning and demonstrate its superior performance in CMR compared with 11 state-of-the-art methods.
